Goto

Collaborating Authors

 grammatical evolution


Model Discovery with Grammatical Evolution. An Experiment with Prime Numbers

Skrzyński, Jakub, Sepioło, Dominik, Ligęza, Antoni

arXiv.org Artificial Intelligence

Machine Learning produces efficient decision and prediction models based on input-output data only. Such models have the form of decision trees or neural nets and are far from transparent analytical models, based on mathematical formulas. Analytical model discovery requires additional knowledge and may be performed with Grammatical Evolution. Such models are transparent, concise, and have readable components and structure. This paper reports on a non-trivial experiment with generating such models.


Task-Specific Activation Functions for Neuroevolution using Grammatical Evolution

Winter, Benjamin David, Teahan, William John

arXiv.org Artificial Intelligence

Activation functions play a critical role in the performance and behaviour of neural networks, significantly impacting their ability to learn and generalise. Traditional activation functions, such as ReLU, sigmoid, and tanh, have been widely used with considerable success. However, these functions may not always provide optimal performance for all tasks and datasets. In this paper, we introduce Neuvo GEAF - an innovative approach leveraging grammatical evolution (GE) to automatically evolve novel activation functions tailored to specific neural network architectures and datasets. Experiments conducted on well-known binary classification datasets show statistically significant improvements in F1-score (between 2.4% and 9.4%) over ReLU using identical network architectures. Notably, these performance gains were achieved without increasing the network's parameter count, supporting the trend toward more efficient neural networks that can operate effectively on resource-constrained edge devices. This paper's findings suggest that evolved activation functions can provide significant performance improvements for compact networks while maintaining energy efficiency during both training and inference phases.


Interpretable Solutions for Breast Cancer Diagnosis with Grammatical Evolution and Data Augmentation

Hasan, Yumnah, de Lima, Allan, Amerehi, Fatemeh, de Bulnes, Darian Reyes Fernandez, Healy, Patrick, Ryan, Conor

arXiv.org Artificial Intelligence

Medical imaging diagnosis increasingly relies on Machine Learning (ML) models. This is a task that is often hampered by severely imbalanced datasets, where positive cases can be quite rare. Their use is further compromised by their limited interpretability, which is becoming increasingly important. While post-hoc interpretability techniques such as SHAP and LIME have been used with some success on so-called black box models, the use of inherently understandable models makes such endeavors more fruitful. This paper addresses these issues by demonstrating how a relatively new synthetic data generation technique, STEM, can be used to produce data to train models produced by Grammatical Evolution (GE) that are inherently understandable. STEM is a recently introduced combination of the Synthetic Minority Oversampling Technique (SMOTE), Edited Nearest Neighbour (ENN), and Mixup; it has previously been successfully used to tackle both between class and within class imbalance issues. We test our technique on the Digital Database for Screening Mammography (DDSM) and the Wisconsin Breast Cancer (WBC) datasets and compare Area Under the Curve (AUC) results with an ensemble of the top three performing classifiers from a set of eight standard ML classifiers with varying degrees of interpretability. We demonstrate that the GE-derived models present the best AUC while still maintaining interpretable solutions.


Automatic Design of Semantic Similarity Ensembles Using Grammatical Evolution

Martinez-Gil, Jorge

arXiv.org Artificial Intelligence

Semantic similarity measures are widely used in natural language processing to catalyze various computer-related tasks. However, no single semantic similarity measure is the most appropriate for all tasks, and researchers often use ensemble strategies to ensure performance. This research work proposes a method for automatically designing semantic similarity ensembles. In fact, our proposed method uses grammatical evolution, for the first time, to automatically select and aggregate measures from a pool of candidates to create an ensemble that maximizes correlation to human judgment. The method is evaluated on several benchmark datasets and compared to state-of-the-art ensembles, showing that it can significantly improve similarity assessment accuracy and outperform existing methods in some cases. As a result, our research demonstrates the potential of using grammatical evolution to automatically compare text and prove the benefits of using ensembles for semantic similarity tasks. The source code that illustrates our approach can be downloaded from https://github.com/jorge-martinez-gil/sesige.


Learning Difference Equations with Structured Grammatical Evolution for Postprandial Glycaemia Prediction

Parra, Daniel, Joedicke, David, Velasco, J. Manuel, Kronberger, Gabriel, Hidalgo, J. Ignacio

arXiv.org Artificial Intelligence

People with diabetes must carefully monitor their blood glucose levels, especially after eating. Blood glucose regulation requires a proper combination of food intake and insulin boluses. Glucose prediction is vital to avoid dangerous post-meal complications in treating individuals with diabetes. Although traditional methods, such as artificial neural networks, have shown high accuracy rates, sometimes they are not suitable for developing personalised treatments by physicians due to their lack of interpretability. In this study, we propose a novel glucose prediction method emphasising interpretability: Interpretable Sparse Identification by Grammatical Evolution. Combined with a previous clustering stage, our approach provides finite difference equations to predict postprandial glucose levels up to two hours after meals. We divide the dataset into four-hour segments and perform clustering based on blood glucose values for the twohour window before the meal. Prediction models are trained for each cluster for the two-hour windows after meals, allowing predictions in 15-minute steps, yielding up to eight predictions at different time horizons. Prediction safety was evaluated based on Parkes Error Grid regions. Our technique produces safe predictions through explainable expressions, avoiding zones D (0.2% average) and E (0%) and reducing predictions on zone C (6.2%). In addition, our proposal has slightly better accuracy than other techniques, including sparse identification of non-linear dynamics and artificial neural networks. The results demonstrate that our proposal provides interpretable solutions without sacrificing prediction accuracy, offering a promising approach to glucose prediction in diabetes management that balances accuracy, interpretability, and computational efficiency.


Modular Grammatical Evolution for the Generation of Artificial Neural Networks

Soltanian, Khabat, Ebnenasir, Ali, Afsharchi, Mohsen

arXiv.org Artificial Intelligence

This paper presents a novel method, called Modular Grammatical Evolution (MGE), towards validating the hypothesis that restricting the solution space of NeuroEvolution to modular and simple neural networks enables the efficient generation of smaller and more structured neural networks while providing acceptable (and in some cases superior) accuracy on large data sets. MGE also enhances the state-of-the-art Grammatical Evolution (GE) methods in two directions. First, MGE's representation is modular in that each individual has a set of genes, and each gene is mapped to a neuron by grammatical rules. Second, the proposed representation mitigates two important drawbacks of GE, namely the low scalability and weak locality of representation, towards generating modular and multi-layer networks with a high number of neurons. We define and evaluate five different forms of structures with and without modularity using MGE and find single-layer modules with no coupling more productive. Our experiments demonstrate that modularity helps in finding better neural networks faster. We have validated the proposed method using ten well-known classification benchmarks with different sizes, feature counts, and output class count. Our experimental results indicate that MGE provides superior accuracy with respect to existing NeuroEvolution methods and returns classifiers that are significantly simpler than other machine learning generated classifiers. Finally, we empirically demonstrate that MGE outperforms other GE methods in terms of locality and scalability properties.


Rule-based Evolutionary Bayesian Learning

Botsas, Themistoklis, Mason, Lachlan R., Matar, Omar K., Pan, Indranil

arXiv.org Machine Learning

In our previous work, we introduced the rule-based Bayesian Regression, a methodology that leverages two concepts: (i) Bayesian inference, for the general framework and uncertainty quantification and (ii) rule-based systems for the incorporation of expert knowledge and intuition. The resulting method creates a penalty equivalent to a common Bayesian prior, but it also includes information that typically would not be available within a standard Bayesian context. In this work, we extend the aforementioned methodology with grammatical evolution, a symbolic genetic programming technique that we utilise for automating the rules' derivation. Our motivation is that grammatical evolution can potentially detect patterns from the data with valuable information, equivalent to that of expert knowledge. We illustrate the use of the rule-based Evolutionary Bayesian learning technique by applying it to synthetic as well as real data, and examine the results in terms of point predictions and associated uncertainty.


evoltree

#artificialintelligence

Both variants evolve variable-length DTs and perform a simultaneous optimization of the predictive performance (measured in terms of AUC) and model complexity (measured in terms of GE tree nodes). To handle big data, the GE methods include a training sampling and parallelism evaluation mechanism. Both variants both use PonyGE2 as GE engine, while EDTL uses sklearn DT. Solutions are represented as a numpy.where Due to this representation, current evoltree implementation only allows numerical attributes.


Evolutionary learning of interpretable decision trees

Custode, Leonardo Lucio, Iacca, Giovanni

arXiv.org Artificial Intelligence

Reinforcement learning techniques achieved human-level performance in several tasks in the last decade. However, in recent years, the need for interpretability emerged: we want to be able to understand how a system works and the reasons behind its decisions. Not only we need interpretability to assess the safety of the produced systems, we also need it to extract knowledge about unknown problems. While some techniques that optimize decision trees for reinforcement learning do exist, they usually employ greedy algorithms or they do not exploit the rewards given by the environment. This means that these techniques may easily get stuck in local optima. In this work, we propose a novel approach to interpretable reinforcement learning that uses decision trees. We present a two-level optimization scheme that combines the advantages of evolutionary algorithms with the advantages of Q-learning. This way we decompose the problem into two sub-problems: the problem of finding a meaningful and useful decomposition of the state space, and the problem of associating an action to each state. We test the proposed method on three well-known reinforcement learning benchmarks, on which it results competitive with respect to the state-of-the-art in both performance and interpretability. Finally, we perform an ablation study that confirms that using the two-level optimization scheme gives a boost in performance in non-trivial environments with respect to a one-layer optimization technique.


Probabilistic Grammatical Evolution

Mégane, Jessica, Lourenço, Nuno, Machado, Penousal

arXiv.org Artificial Intelligence

Grammatical Evolution (GE) is one of the most popular Genetic Programming (GP) variants, and it has been used with success in several problem domains. Since the original proposal, many enhancements have been proposed to GE in order to address some of its main issues and improve its performance. In this paper we propose Probabilistic Grammatical Evolution (PGE), which introduces a new genotypic representation and new mapping mechanism for GE. Specifically, we resort to a Probabilistic Context-Free Grammar (PCFG) where its probabilities are adapted during the evolutionary process, taking into account the productions chosen to construct the fittest individual. The genotype is a list of real values, where each value represents the likelihood of selecting a derivation rule. We evaluate the performance of PGE in two regression problems and compare it with GE and Structured Grammatical Evolution (SGE). The results show that PGE has a a better performance than GE, with statistically significant differences, and achieved similar performance when comparing with SGE.